In this example we will examine multivariate forecasting models using
mvgam, which fits GAMs using MCMC sampling via the
JAGS software (Note that JAGS is required;
installation links are found here).
First, we run a simulation experiment to determine whether mvgam's
complexity penalisation works by reducing the number of
un-needed dynamic factors. In any factor model, choosing the appropriate
number of factors K can be difficult. The approach used by
mvgam is to estimate a penalty for each factor that
squeezes the factor’s variance toward zero, effectively forcing the
factor to evolve as a flat white noise process. By allowing each
factor’s penalty to be estimated in an exponentially increasing manner
(following Welty, Leah J., et al. Bayesian distributed lag models:
estimating effects of particulate matter air pollution on daily
mortality Biometrics 65.1 (2009): 282-291), we hope that we can guard
against specifying too large a K. But we should test this
property to see how it behaves in different scenarios. Begin by
simulating 6 series that evolve with a shared seasonal
pattern and that depend on 2 latent random walk factors.
Each series is 80 time steps long, with a seasonal
frequency of 12. We give the trend moderate importance by
setting trend_rel = 0.6 and we allow each series’
observation process to be drawn from slightly different Poisson
distributions
set.seed(500)
library(mvgam)
## Loading required package: mgcv
## Loading required package: nlme
## This is mgcv 1.8-33. For overview type 'help("mgcv-package")'.
## Loading required package: parallel
dat <- sim_mvgam(T = 80, n_series = 6, n_lv = 2,
                 family = "poisson",
                 mu_obs = runif(6, 4, 6),
                 trend_rel = 0.6, train_prop = 0.8)
Have a look at the series
par(mfrow = c(3, 2))
for (i in 1:6) {
  plot(dat$data_train$y[which(as.numeric(dat$data_train$series) == i)],
       type = "l", ylab = paste("Series", i), xlab = "Time")
}
par(mfrow = c(1, 1))
Clearly there are some correlations in the trends for these series.
But how does a dynamic factor process allow us to capture
these dependencies? Essentially, a dynamic factor is an unobserved
(latent) random process that induces correlations between time series
via a set of factor loadings (\(\beta\)) while also achieving dimension
reduction. The loadings represent constant associations between the
observed time series and the dynamic factor, but each series can still
deviate from the factor through its error process and its associations
with other factors (if we estimate >1 latent factor in
our model).
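To make this concrete, here is a minimal sketch (with hypothetical loadings, not mvgam's internals) of how a small number of latent random walk factors can induce correlated trends across several observed series:

```r
# sketch: 2 latent random walk factors driving 6 correlated series
set.seed(1)
n_time <- 80; n_series <- 6; n_lv <- 2

# each latent factor evolves as a random walk
LV <- apply(matrix(rnorm(n_time * n_lv, sd = 0.25), n_time, n_lv),
            2, cumsum)

# hypothetical constant loadings linking each series to the factors
Z <- matrix(runif(n_series * n_lv, -1, 1), n_series, n_lv)

# each series' latent trend is a weighted sum of the shared factors,
# so series with similar loadings end up with correlated trends
trend <- LV %*% t(Z)   # dimension n_time x n_series
round(cor(trend), 2)
```

Note the dimension reduction at work: six latent trends are fully described by only two stochastic processes plus a fixed loadings matrix.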
A challenge with any factor model is the need to determine the number
of factors K. Setting K too small prevents
temporal dependencies from being adequately modelled, leading to poor
convergence and difficulty estimating smooth parameters. By contrast,
setting K too large leads to unnecessary computation.
mvgam approaches this problem by formulating a prior
distribution that enforces exponentially increasing penalties on the
factor variances to allow any un-needed factors to evolve as flat lines.
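The flavour of this prior can be sketched outside of JAGS. The names eta1 and eta2 mirror those in mvgam's generated model code (shown in full later), but the fixed values below are illustrative assumptions and the sketch is a deterministic caricature of what is really a stochastic prior:

```r
# illustrative sketch of exponentially increasing factor penalties
eta1 <- 0.5   # controls the baseline penalty
eta2 <- 0.8   # controls how quickly penalties increase with factor index
n_lv <- 6

# precision penalty grows exponentially with the factor's index
penalty <- exp(eta1) * exp(eta2 * (1:n_lv))
round(penalty, 2)

# later factors therefore have prior variances (1 / penalty) squeezed
# toward zero, forcing un-needed factors toward flat lines
round(1 / penalty, 4)
</```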
Now let’s fit a well-specified model for our simulated series in which
we estimate random intercepts, a shared seasonal cyclic smooth and
2 latent dynamic factors
mod1 <- mvgam(data_train = dat$data_train,
              data_test = dat$data_test,
              formula = y ~ s(series, bs = "re") +
                s(season, bs = "cc", k = 12),
              knots = list(season = c(0.5, 12.5)),
              use_lv = TRUE, n_lv = 2,
              family = "poisson", trend_model = "RW",
              chains = 4, burnin = 1000)
Look at a few plots. The estimated smooth function
plot_mvgam_smooth(object = mod1, series = 1,
smooth = "season")
And the true seasonal function in the simulation
plot(dat$global_seasonality[1:12], type = "l")
Check whether each factor was retained using the
plot_mvgam_factors function. Here, each factor is tested
against a null hypothesis of white noise by calculating the sum of the
factor’s 1st derivatives. A factor that has a larger contribution to the
series’ latent trends will have a larger sum, both because that factor’s
absolute magnitudes will be larger (due to the weaker penalty on the
factor’s precision) and because the factor will move around more. By
normalising these estimated first derivative sums, it should be apparent
whether some factors have been dropped from the model. Here we see that
each factor is contributing to the series’ latent trends, and the plots
show that neither has been forced to evolve as white noise
plot_mvgam_factors(mod1)
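For intuition, the first-derivative check can be sketched with simulated factors (illustrative only; plot_mvgam_factors performs this calculation on the posterior factor estimates):

```r
# sketch of the white noise check: sum each factor's absolute first
# derivatives, then normalise across factors
fake_factors <- cbind(cumsum(rnorm(76)),      # a wiggly, retained factor
                      rnorm(76, sd = 0.05))   # a near-flat, penalised factor

deriv_sums <- apply(fake_factors, 2, function(x) sum(abs(diff(x))))

# relative contributions; a heavily penalised factor scores near zero
round(deriv_sums / sum(deriv_sums), 2)
```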
Now we fit the same model but assume that we know nothing about how
many factors to use, so we specify the maximum allowed (the total number
of series, 6). Note that this model is computationally more
expensive so it will take longer to fit
mod2 <- mvgam(data_train = dat$data_train,
              data_test = dat$data_test,
              formula = y ~ s(series, bs = "re") +
                s(season, bs = "cc", k = 12),
              knots = list(season = c(0.5, 12.5)),
              use_lv = TRUE, n_lv = 6,
              family = "poisson", trend_model = "RW",
              chains = 4, burnin = 1000)
The same plots as model 1 to see if this model has also fit the data well
plot_mvgam_smooth(object = mod2, series = 1,
smooth = "season")
plot(dat$global_seasonality[1:12], type = "l")
Examining the factor contributions gives us some insight into whether
we set n_lv larger than we perhaps needed to. These
contributions can be interpreted similarly to ordination axes when
deciding how many latent variables to specify
plot_mvgam_factors(mod2)
The very weak contributions by some of the factors are a result of
the penalisation, which will become more important as the dimensionality
of the data grows. Now onto an empirical example. Here we will access
monthly search volume data from Google Trends, focusing on the
relative importance of search terms related to tick paralysis in
Queensland, Australia
library(tidyr)
if (!require(gtrendsR)) {
  install.packages("gtrendsR")
}
terms <- c("tick bite", "tick paralysis",
           "dog tick", "paralysis tick dog")
trends <- gtrendsR::gtrends(terms, geo = "AU-QLD",
                            time = "all", onlyInterest = TRUE)
Google Trends modified their algorithm for extracting
search volume data in 2012, so we filter the series to only include
observations after this point in time
gtest <- trends$interest_over_time %>%
  tidyr::spread(keyword, hits) %>%
  dplyr::select(-geo, -time, -gprop, -category) %>%
  dplyr::mutate(date = lubridate::ymd(date)) %>%
  dplyr::mutate(year = lubridate::year(date)) %>%
  dplyr::filter(year > 2012) %>%
  dplyr::select(-year)
Convert to an xts object and then to the required
mvgam format, holding out the final 10% of observations as
the test data
series <- xts::xts(x = gtest[, -1], order.by = gtest$date)
trends_data <- series_to_mvgam(series, freq = 12,
train_prop = 0.9)
Plot the series to see how similar their seasonal shapes are over time
plot(series, legend.loc = "topleft")
Now we will fit an mvgam model with shared seasonality
and random intercepts per series. Our first attempt will ignore any
temporal component in the residuals so that we can identidy which GAM
predictor combination gives us the best fit, prior to investigating how
to deal with any remaining autocorrelation. We assume a Negative
Binomial distrubution with series-specific overdispersion parameters for
the response. We use a complexity-penalising prior for the
overdispersion, which allows the model to reduce toward a more simple
Poisson observation process unless the data provide adequate information
to support overdispersion. Also note that any smooths using the random
effects basis (s(series, bs = "re") below) are
automatically re-parameterised to use the non-centred
parameterisation that is necessary to help avoid common posterior
degeneracies in hierarchical models. This parameterisation tends to
work better for most ecological problems where the data for each group /
context are not highly informative, but it is still probably worth
investigating whether a centred or even a mix of centred / non-centred
will give better computational performance. We suppress the global
intercept as it is not needed and will lead to identifiability issues
when estimating the series-specific random intercepts
trends_mod1 <- mvgam(data_train = trends_data$data_train,
                     data_test = trends_data$data_test,
                     formula = y ~ s(series, bs = "re") +
                       s(season, k = 12, m = 2, bs = "cc") - 1,
                     knots = list(season = c(0.5, 12.5)),
                     trend_model = "None", family = "nb",
                     chains = 4, burnin = 8000)
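As an aside, the non-centred parameterisation mentioned above can be sketched generically. This is an illustration of the idea only, not the exact code mvgam writes into its JAGS model file:

```r
# centred vs non-centred random intercepts (generic illustration)
set.seed(2)
mu <- 2; sigma <- 0.5; n_series <- 4

# centred: draw each intercept directly from its population distribution;
# this couples the intercepts to sigma in the posterior geometry
b_centred <- rnorm(n_series, mean = mu, sd = sigma)

# non-centred: draw standard normal 'raw' effects, then scale and shift;
# the raw effects are a priori independent of mu and sigma, which helps
# samplers when the per-group data are weakly informative
z <- rnorm(n_series)
b_noncentred <- mu + sigma * z
```

Both versions imply the same marginal distribution for the intercepts; they differ only in how the sampler explores the joint posterior.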
Given that these series could potentially be following a hierarchical seasonality, we will also trial a slightly more complex model with an extra smoothing term per series that allows its seasonal curve to deviate from the global seasonal smooth. Ignore the warning about repeated smooths, as this is not an issue for estimation.
trends_mod2 <- mvgam(data_train = trends_data$data_train,
                     data_test = trends_data$data_test,
                     formula = y ~ s(season, k = 12, m = 2, bs = "cc") +
                       s(season, series, k = 5, bs = "fs", m = 1),
                     knots = list(season = c(0.5, 12.5)),
                     trend_model = "None", family = "nb",
                     chains = 4, burnin = 8000)
How can we compare these models to ensure we choose one that performs
well and provides useful inferences? Beyond posterior retrodictive and
predictive checks, we can take advantage of the fact that
mvgam fits an mgcv model to provide all the
necessary penalty matrices, as well as to identify good initial values
for smoothing parameters. Because we did not modify this model by adding
a trend component (the only modification is that we estimated
series-specific overdispersion parameters), we can still employ the
usual mgcv model comparison routines
anova(trends_mod1$mgcv_model, trends_mod2$mgcv_model,
test = "LRT")
AIC(trends_mod1$mgcv_model, trends_mod2$mgcv_model)
summary(trends_mod1$mgcv_model)
##
## Family: Negative Binomial(16.101)
## Link function: log
##
## Formula:
## y ~ s(series, bs = "re") + s(season, k = 12, m = 2, bs = "cc") -
## 1
##
## Approximate significance of smooth terms:
## edf Ref.df Chi.sq p-value
## s(series) 3.999 4 26362 <2e-16 ***
## s(season) 6.672 10 86785 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## R-sq.(adj) = 0.769 Deviance explained = 80%
## -REML = 1291.5 Scale est. = 1 n = 412
summary(trends_mod2$mgcv_model)
##
## Family: Negative Binomial(23.491)
## Link function: log
##
## Formula:
## y ~ s(season, k = 12, m = 2, bs = "cc") + s(season, series, k = 5,
## bs = "fs", m = 1)
##
## Parametric coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 2.614 0.437 5.982 2.2e-09 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Approximate significance of smooth terms:
## edf Ref.df Chi.sq p-value
## s(season) 6.346 10 119.8 <2e-16 ***
## s(season,series) 12.652 19 1153.7 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## R-sq.(adj) = 0.813 Deviance explained = 83.3%
## -REML = 1265.4 Scale est. = 1 n = 412
Model 2 seems to fit better so far, suggesting that hierarchical
seasonality gives better performance for these series. But a problem
with both of the above models is that their forecast uncertainties will
not increase into the future, which is not how time series forecasts
should behave. Here we fit Model 2 again but specifying a time series
model for the latent trends. We assume the dynamic trends can be
represented using latent factors that each follow an AR1 process, and we
will rely on the exponential penalties to help regularise any un-needed
factors by setting n_lv = 4
trends_mod3 <- mvgam(data_train = trends_data$data_train,
                     data_test = trends_data$data_test,
                     formula = y ~ s(season, k = 12, m = 2, bs = "cc") +
                       s(season, series, k = 5, bs = "fs", m = 1),
                     knots = list(season = c(0.5, 12.5)),
                     trend_model = "AR1", use_lv = TRUE,
                     n_lv = 4, family = "nb",
                     chains = 4, burnin = 8000)
Have a look at the returned JAGS model file to see how
the dynamic factors are incorporated. Notice that the factor precisions
are penalised using the exponentially increasing penalties described
above, while the factor loadings are given constrained standard normal
priors for identifiability
trends_mod3$model_file
## [1] "JAGS model code generated by package mvgam"
## [2] ""
## [3] "GAM formula:"
## [4] "y ~ s(season, k = 12, m = 2, bs = cc) + s(season, series, k = 5, bs = fs, m = 1)"
## [5] ""
## [6] "Trend model:"
## [7] "AR1"
## [8] ""
## [9] "Required data:"
## [10] "integer n; number of timepoints per series"
## [11] "integer n_series; number of series"
## [12] "matrix y; time-ordered observations of dimension n x n_series (missing values allowed)"
## [13] "matrix ytimes; time-ordered n x n_series matrix (which row in X belongs to each [time, series] observation?)"
## [14] "matrix X; mgcv GAM design matrix of dimension (n x n_series) x basis dimension"
## [15] "matrix S1; mgcv smooth penalty matrix S1"
## [16] "vector zero; prior basis coefficient locations vector of length ncol(X)"
## [17] "vector p_coefs; vector (length = 1) of prior Gaussian means for parametric effects"
## [18] "vector p_taus; vector (length = 1) of prior Gaussian precisions for parametric effects"
## [19] "vector sp; inverse exponential location priors for smoothing parameters: s(season); s(season,series)1; s(season,series)2"
## [20] "___________values ranging 5 - 50 are a good start"
## [21] ""
## [22] ""
## [23] "#### Begin model ####"
## [24] "model {"
## [25] ""
## [26] "## GAM linear predictor"
## [27] "eta <- X %*% b"
## [28] ""
## [29] "## mean expectations"
## [30] "for (i in 1:n) {"
## [31] "for (s in 1:n_series) {"
## [32] "mu[i, s] <- exp(eta[ytimes[i, s]] + trend[i, s])"
## [33] "}"
## [34] "}"
## [35] ""
## [36] "## latent factors evolve as time series with penalised precisions;"
## [37] "## the penalty terms force any un-needed factors to evolve as flat lines"
## [38] "for (j in 1:n_lv) {"
## [39] "LV[1, j] ~ dnorm(0, penalty[j])"
## [40] "}"
## [41] ""
## [42] "for (i in 2:n) {"
## [43] "for (j in 1:n_lv) {"
## [44] "LV[i, j] ~ dnorm(ar1[j]*LV[i - 1, j], penalty[j])"
## [45] "}"
## [46] "}"
## [47] ""
## [48] "## AR components"
## [49] "for (s in 1:n_lv) {"
## [50] "ar1[s] ~ dnorm(0, 10)"
## [51] "}"
## [52] ""
## [53] "## shrinkage penalties for each factor's precision parameter act to squeeze"
## [54] "## the entire factor toward a flat white noise process if supported by"
## [55] "## the data. The prior for individual factor penalties allows each factor to possibly"
## [56] "## have a relatively large penalty, which shrinks the prior for that factor's variance"
## [57] "## substantially. Penalties increase exponentially with the number of factors following"
## [58] "## Welty, Leah J., et al. Bayesian distributed lag models: estimating effects of particulate"
## [59] "## matter air pollution on daily mortality Biometrics 65.1 (2009): 282-291."
## [60] "pi ~ dunif(0, n_lv)"
## [61] "X2 ~ dnorm(0, 1)T(0, )"
## [62] ""
## [63] "# eta1 controls the baseline penalty"
## [64] "eta1 ~ dunif(-1, 1)"
## [65] ""
## [66] "# eta2 controls how quickly the penalties exponentially increase"
## [67] "eta2 ~ dunif(-1, 1)"
## [68] ""
## [69] "for (t in 1:n_lv) {"
## [70] "X1[t] ~ dnorm(0, 1)T(0, )"
## [71] "l.dist[t] <- max(t, pi[])"
## [72] "l.weight[t] <- exp(eta2[] * l.dist[t])"
## [73] "l.var[t] <- exp(eta1[] * l.dist[t] / 2) * 1"
## [74] "theta.prime[t] <- l.weight[t] * X1[t] + (1 - l.weight[t]) * X2[]"
## [75] "penalty[t] <- max(0.0001, theta.prime[t] * l.var[t])"
## [76] "}"
## [77] ""
## [78] "## latent factor loadings: standard normal with identifiability constraints"
## [79] "## upper triangle of loading matrix set to zero"
## [80] "for (j in 1:(n_lv - 1)) {"
## [81] "for (j2 in (j + 1):n_lv) {"
## [82] "lv_coefs[j, j2] <- 0"
## [83] "}"
## [84] "}"
## [85] ""
## [86] "## positive constraints on loading diagonals"
## [87] "for (j in 1:n_lv) {"
## [88] "lv_coefs[j, j] ~ dnorm(0, 1)T(0, 1);"
## [89] "}"
## [90] ""
## [91] "## lower diagonal free"
## [92] "for (j in 2:n_lv) {"
## [93] "for (j2 in 1:(j - 1)) {"
## [94] "lv_coefs[j, j2] ~ dnorm(0, 1)T(-1, 1);"
## [95] "}"
## [96] "}"
## [97] ""
## [98] "## other elements also free"
## [99] "for (j in (n_lv + 1):n_series) {"
## [100] "for (j2 in 1:n_lv) {"
## [101] "lv_coefs[j, j2] ~ dnorm(0, 1)T(-1, 1);"
## [102] "}"
## [103] "}"
## [104] ""
## [105] "## trend evolution depends on latent factors"
## [106] "for (i in 1:n) {"
## [107] "for (s in 1:n_series) {"
## [108] "trend[i, s] <- inprod(lv_coefs[s,], LV[i,])"
## [109] "}"
## [110] "}"
## [111] ""
## [112] "## likelihood functions"
## [113] "for (i in 1:n) {"
## [114] "for (s in 1:n_series) {"
## [115] "y[i, s] ~ dnegbin(rate[i, s], r[s])"
## [116] "rate[i, s] <- ifelse((r[s] / (r[s] + mu[i, s])) < min_eps, min_eps,"
## [117] "(r[s] / (r[s] + mu[i, s])))"
## [118] "}"
## [119] "}"
## [120] ""
## [121] "## complexity penalising prior for the overdispersion parameter;"
## [122] "## where the likelihood reduces to a 'base' model (Poisson) unless"
## [123] "## the data support overdispersion"
## [124] "for (s in 1:n_series) {"
## [125] "r[s] <- 1 / r_raw[s]"
## [126] "r_raw[s] ~ dnorm(0, 0.1)T(0, )"
## [127] "}"
## [128] ""
## [129] "## posterior predictions"
## [130] "for (i in 1:n) {"
## [131] "for (s in 1:n_series) {"
## [132] "ypred[i, s] ~ dnegbin(rate[i, s], r[s])"
## [133] "}"
## [134] "}"
## [135] ""
## [136] "## GAM-specific priors"
## [137] "## parametric effect priors (regularised for identifiability)"
## [138] "for (i in 1:1) { b[i] ~ dnorm(p_coefs[i], p_taus[i]) }"
## [139] "## prior for s(season)..."
## [140] "K1 <- S1[1:10,1:10] * lambda[1]"
## [141] "b[2:11] ~ dmnorm(zero[2:11],K1)"
## [142] "## prior for s(season,series)..."
## [143] "for (i in c(12:15,17:20,22:25,27:30)) { b[i] ~ dnorm(0, lambda[2]) }"
## [144] "for (i in c(16,21,26,31)) { b[i] ~ dnorm(0, lambda[3]) }"
## [145] "## smoothing parameter priors..."
## [146] "for (i in 1:3) {"
## [147] "lambda[i] ~ dexp(1/sp[i])T(0.0001, )"
## [148] "rho[i] <- log(lambda[i])"
## [149] "}"
## [150] "}"
Inspection of the dynamic factors and their relative contributions indicates that the first factor is by far the most important
plot_mvgam_factors(trends_mod3)
Model 3 (with the dynamic trend) should provide far superior forecasts to those relying only on the estimated smooths. Inspect the model summary (note again that p-value approximations are a work in progress here and so may not be totally reliable).
summary(trends_mod3)
## GAM formula:
## y ~ s(season, k = 12, m = 2, bs = "cc") + s(season, series, k = 5,
## bs = "fs", m = 1)
##
## Family:
## Negative Binomial
##
## Link function:
## log
##
## Trend model:
## AR1
##
## N latent factors:
## 4
##
## N series:
## 4
##
## N observations:
## 412
##
## Status:
## Fitted using runjags::run.jags()
##
## Dispersion parameter estimates:
## 2.5% 50% 97.5% Rhat n.eff
## r[1] 23.44483 92.20777 2190.3622 1.05 8000
## r[2] 13.59015 71.73257 2128.1592 1.12 8000
## r[3] 14.14273 34.52054 357.1247 1.04 6028
## r[4] 29.49299 103.89077 2211.4826 1.29 8000
##
## GAM smooth term estimated degrees of freedom:
## edf df
## s(season) 14.18 10
## s(season,series) 22.59 5
##
## GAM coefficient (beta) estimates:
## 2.5% 50% 97.5% Rhat n.eff
## (Intercept) 1.34924545 2.703309200 3.71066919 1.47 19
## s(season).1 -0.47707137 -0.207967954 0.07357554 1.03 67
## s(season).2 -0.80481158 -0.575573184 -0.34976897 1.04 122
## s(season).3 -0.85497757 -0.536350033 -0.27135299 1.06 84
## s(season).4 -0.72342582 -0.354119845 -0.04479177 1.11 74
## s(season).5 -0.57041217 -0.246856875 0.11932560 1.13 70
## s(season).6 -0.34210857 -0.012633775 0.35171011 1.09 61
## s(season).7 0.05311111 0.407080293 0.70568856 1.03 62
## s(season).8 0.43187763 0.737129106 1.08565532 1.01 68
## s(season).9 0.43021685 0.667949530 0.94935042 1.05 77
## s(season).10 -0.01437468 0.248777124 0.50765227 1.11 75
## s(season,series).1 -0.05547047 0.152213067 0.36405461 1.01 137
## s(season,series).2 -0.09604171 0.145641446 0.40452489 1.03 131
## s(season,series).3 -0.35372591 -0.071622532 0.21806825 1.12 58
## s(season,series).4 0.01835299 0.122393552 0.24810137 1.03 76
## s(season,series).5 -0.59377899 0.499852413 1.58189525 1.44 20
## s(season,series).6 -0.39871471 -0.148875380 0.10275414 1.00 312
## s(season,series).7 -0.21222443 0.075283187 0.37500104 1.01 399
## s(season,series).8 -0.28138774 -0.001613888 0.29469212 1.09 103
## s(season,series).9 -0.13924001 -0.015380848 0.12312927 1.02 124
## s(season,series).10 -2.26922638 -1.270138354 -0.19550099 1.36 26
## s(season,series).11 -0.11432861 0.100782212 0.32801302 1.01 178
## s(season,series).12 -0.35735797 -0.098988973 0.15499292 1.03 198
## s(season,series).13 -0.13192009 0.150306937 0.44804053 1.11 71
## s(season,series).14 -0.03026027 0.080095162 0.20694442 1.03 80
## s(season,series).15 -1.28557720 -0.133320092 0.95349892 1.43 24
## s(season,series).16 -0.37138238 -0.164824320 0.05181315 1.01 157
## s(season,series).17 -0.25258639 -0.006458399 0.24473930 1.02 136
## s(season,series).18 -0.26605127 0.011304219 0.30051785 1.11 65
## s(season,series).19 -0.13387540 -0.030905679 0.09378187 1.03 85
## s(season,series).20 -0.37894427 0.586885667 1.61507634 1.39 25
##
## GAM smoothing parameter (rho) estimates:
## 2.5% 50% 97.5% Rhat n.eff
## s(season) 4.976007 6.184579 7.1061105 1.00 586
## s(season,series) 2.123326 2.938408 3.6020307 1.01 1480
## s(season,series)2 -3.460605 -1.982924 -0.9868051 1.00 8030
##
## Latent trend parameter estimates:
## 2.5% 50% 97.5% Rhat n.eff
## ar1[1] -0.2653269 0.11504495 0.5025188 1.00 2744
## ar1[2] 0.5831062 0.93133754 1.0244630 1.02 134
## ar1[3] -0.4983205 0.06100642 0.6626636 1.01 1027
## ar1[4] -0.5205574 0.02821556 0.6198868 1.00 1232
##
Look at Dunn-Smyth residuals for some series from this preferred model to ensure that our dynamic factor process has captured most of the temporal dependencies in the observations
plot_mvgam_resids(trends_mod3, series = 1)
plot_mvgam_resids(trends_mod3, series = 2)
plot_mvgam_resids(trends_mod3, series = 3)
plot_mvgam_resids(trends_mod3, series = 4)
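For reference, a Dunn-Smyth (randomised quantile) residual for a single Negative Binomial observation can be sketched as follows. The fitted values here are hypothetical, and mvgam computes these residuals across posterior draws rather than for a single point estimate:

```r
# Dunn-Smyth residual for one Negative Binomial observation
set.seed(3)
y_obs <- 7; mu_hat <- 5; size_hat <- 10   # hypothetical fitted values

# CDF evaluated just below and at the observed count
a <- pnbinom(y_obs - 1, mu = mu_hat, size = size_hat)
b <- pnbinom(y_obs, mu = mu_hat, size = size_hat)

# randomise within the discrete CDF step, then map to the normal scale;
# for a well-specified model these residuals are standard normal
resid_ds <- qnorm(runif(1, min = a, max = b))
```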
Perform posterior predictive checks to see if the model is able to
simulate data that look realistic and unbiased, here by comparing
histograms of posterior predictions (yhat) against the
distribution of the observations (y). This will be
particularly useful for examining whether the Negative Binomial
observation model is able to produce realistic-looking simulations for
each individual series.
ppc(trends_mod3, series = 1, type = "hist")
ppc(trends_mod3, series = 2, type = "hist")
ppc(trends_mod3, series = 3, type = "hist")
ppc(trends_mod3, series = 4, type = "hist")
Look at traceplots for the smoothing parameters
(rho)
plot_mvgam_trace(object = trends_mod3, param = "rho")
Plot posterior predictive distributions for the training and testing periods for each series
plot_mvgam_fc(object = trends_mod3, series = 1,
data_test = trends_data$data_test)
## Out of sample DRPS:
## [1] 43.11525
##
plot_mvgam_fc(object = trends_mod3, series = 2,
data_test = trends_data$data_test)
## Out of sample DRPS:
## [1] 14.53877
##
plot_mvgam_fc(object = trends_mod3, series = 3,
data_test = trends_data$data_test)
## Out of sample DRPS:
## [1] 38.92127
##
plot_mvgam_fc(object = trends_mod3, series = 4,
data_test = trends_data$data_test)
## Out of sample DRPS:
## [1] 54.52083
##
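For intuition about the out-of-sample DRPS values printed above, a simplified version of the score for a single observation could be sketched as below. The drps_score helper and the forecast draws are hypothetical, and mvgam aggregates the score over the full forecast horizon:

```r
# simplified Discrete Rank Probability Score for one forecast
# distribution against one observed count
drps_score <- function(truth, fc_draws, upper = 50) {
  ys <- 0:upper
  Fhat <- ecdf(fc_draws)(ys)               # empirical forecast CDF
  # squared distance between the forecast CDF and the observation's
  # step-function CDF, summed over the count support
  sum((Fhat - as.numeric(ys >= truth))^2)
}

set.seed(4)
fc_draws <- rpois(1000, lambda = 8)        # hypothetical forecast draws
drps_score(truth = 10, fc_draws = fc_draws)
```

The score is zero for a perfect point-mass forecast at the truth and grows as forecast mass sits further from the observation, so lower DRPS indicates better probabilistic forecasts.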
Plot posterior distributions for the latent trend estimates, again for the training and testing periods
plot_mvgam_trend(object = trends_mod3, series = 1,
data_test = trends_data$data_test)
plot_mvgam_trend(object = trends_mod3, series = 2,
data_test = trends_data$data_test)
plot_mvgam_trend(object = trends_mod3, series = 3,
data_test = trends_data$data_test)
plot_mvgam_trend(object = trends_mod3, series = 4,
data_test = trends_data$data_test)
Given that we fit a model with hierarchical seasonality, the seasonal smooths are able to deviate from one another (though they share the same wiggliness and all deviate from a common ‘global’ seasonal function)
plot_mvgam_smooth(object = trends_mod3, series = 1,
smooth = "season")
plot_mvgam_smooth(object = trends_mod3, series = 2,
smooth = "season")
plot_mvgam_smooth(object = trends_mod3, series = 3,
smooth = "season")
plot_mvgam_smooth(object = trends_mod3, series = 4,
smooth = "season")
Plot posterior mean estimates of the latent trend correlations. These correlations are more useful than the latent factor loadings themselves (for example when inspecting ordinations), because the ordering of the loadings, although constrained for identifiability, can vary from chain to chain
correlations <- lv_correlations(object = trends_mod3)
library(ggplot2)
mean_correlations <- correlations$mean_correlations
mean_correlations[upper.tri(mean_correlations)] <- NA
mean_correlations <- data.frame(mean_correlations)
ggplot(mean_correlations %>%
         tibble::rownames_to_column("series1") %>%
         tidyr::pivot_longer(-series1, names_to = "series2",
                             values_to = "Correlation"),
       aes(x = series1, y = series2)) +
  geom_tile(aes(fill = Correlation)) +
  scale_fill_gradient2(low = "darkred", mid = "white",
                       high = "darkblue", midpoint = 0,
                       breaks = seq(-1, 1, length.out = 5),
                       limits = c(-1, 1),
                       name = "Trend\ncorrelation") +
  labs(x = "", y = "") +
  theme_dark() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))
There is certainly some evidence of positive trend correlations for a few of these search terms, which is not surprising given how similar some of them are and how closely linked they should be to interest in tick paralysis in Queensland. Plot some STL decompositions of these series to see whether these trends are noticeable in the data
plot(stl(ts(as.vector(series$`tick paralysis`),
frequency = 12), "periodic"))
plot(stl(ts(as.vector(series$`paralysis tick dog`),
frequency = 12), "periodic"))
plot(stl(ts(as.vector(series$`dog tick`),
frequency = 12), "periodic"))
plot(stl(ts(as.vector(series$`tick bite`),
frequency = 12), "periodic"))
Forecast period posterior predictive checks suggest that the model still has room for improvement:
ppc(trends_mod3, series = 1, type = "hist",
data_test = trends_data$data_test)
ppc(trends_mod3, series = 1, type = "mean",
data_test = trends_data$data_test)
ppc(trends_mod3, series = 2, type = "hist",
data_test = trends_data$data_test)
ppc(trends_mod3, series = 2, type = "mean",
data_test = trends_data$data_test)
ppc(trends_mod3, series = 3, type = "hist",
data_test = trends_data$data_test)
ppc(trends_mod3, series = 3, type = "mean",
data_test = trends_data$data_test)
ppc(trends_mod3, series = 4, type = "hist",
data_test = trends_data$data_test)
ppc(trends_mod3, series = 4, type = "mean",
data_test = trends_data$data_test)
Other next steps could involve devising a more goal-specific set of posterior predictive checks (see this paper by Gelman et al and relevant works by Betancourt for examples) and comparing out-of-sample Discrete Rank Probability Scores for this model against versions with other latent trend models (i.e. AR2, AR3 or Random Walk)